1,228 research outputs found

    Expressive visual text to speech and expression adaptation using deep neural networks

    In this paper, we present an expressive visual text to speech system (VTTS) based on a deep neural network (DNN). Given an input text sentence and a set of expression tags, the VTTS is able to produce not only the audio speech, but also the accompanying facial movements. The expressions can either be one of the expressions in the training corpus or a blend of expressions from the training corpus. Furthermore, we present a method of adapting a previously trained DNN to include a new expression using a small amount of training data. Experiments show that the proposed DNN-based VTTS is preferred by 57.9% over the baseline hidden Markov model based VTTS, which uses cluster adaptive training.

    Tensin1 expression and function in chronic obstructive pulmonary disease

    Chronic obstructive pulmonary disease (COPD) constitutes a major cause of morbidity and mortality. Genome wide association studies have shown significant associations between airflow obstruction or COPD with a non-synonymous SNP in the TNS1 gene, which encodes tensin1. However, the expression, cellular distribution and function of tensin1 in human airway tissue and cells are unknown. We therefore examined these characteristics in tissue and cells from controls and people with COPD or asthma. Airway tissue was immunostained for tensin1. Tensin1 expression in cultured human airway smooth muscle cells (HASMCs) was evaluated using qRT-PCR, western blotting and immunofluorescent staining. siRNAs were used to downregulate tensin1 expression. Tensin1 expression was increased in the airway smooth muscle and lamina propria in COPD tissue, but not asthma, when compared to controls. Tensin1 was expressed in HASMCs and upregulated by TGFβ1. TGFβ1 and fibronectin increased the localisation of tensin1 to fibrillar adhesions. Tensin1 and α-smooth muscle actin (αSMA) were strongly co-localised, and tensin1 depletion in HASMCs attenuated both αSMA expression and contraction of collagen gels. In summary, tensin1 expression is increased in COPD airways, and may promote airway obstruction by enhancing the expression of contractile proteins and their localisation to stress fibres in HASMCs.

    Bird detection in audio : a survey and a challenge

    Many biological monitoring projects rely on acoustic detection of birds. Despite increasingly large datasets, this detection is often manual or semi-automatic, requiring manual tuning/postprocessing. We review the state of the art in automatic bird sound detection, and identify a widespread need for tuning-free and species-agnostic approaches. We introduce new datasets and an IEEE research challenge to address this need, to make possible the development of fully automatic algorithms for bird sound detection.

    Robust excitation-based features for Automatic Speech Recognition

    In this paper we investigate the use of noise-robust features characterizing the speech excitation signal as complementary features to the usually considered vocal tract based features for automatic speech recognition (ASR). The features are tested in a state-of-the-art Deep Neural Network (DNN) based hybrid acoustic model for speech recognition. The suggested excitation features expand the set of excitation features previously considered for ASR, with the expectation that they help to better discriminate the broad phonetic classes (e.g., fricatives, nasals, vowels). Relative improvements in the word error rate are observed in the AMI meeting transcription system, with greater gains (about 5%) if PLP features are combined with the suggested excitation features. For Aurora 4, significant improvements are observed as well. Combining the suggested excitation features with filter banks, a word error rate of 9.96% is achieved. This is the author accepted manuscript. The final version is available from IEEE via http://dx.doi.org/10.1109/ICASSP.2015.717885

    Dysphonia Detection based on modulation spectral features and cepstral coefficients

    In this paper, we combine modulation spectral features with mel-frequency cepstral coefficients for automatic detection of dysphonia. For classification purposes, the dimensions of the original modulation spectra are reduced using higher order singular value decomposition (HOSVD). The most relevant features are selected based on their mutual information with the discrimination between normophonic and dysphonic speakers made by experts. Features that highly correlate with voice alterations are then associated with a support vector machine (SVM) classifier to provide an automatic decision. Recognition experiments using two different databases suggest that the system provides complementary information to the standard mel-cepstral features.

    A fixed dimension and perceptually based dynamic sinusoidal model of speech

    This paper presents a fixed- and low-dimensional, perceptually based dynamic sinusoidal model of speech referred to as PDM (Perceptual Dynamic Model). To decrease and fix the number of sinusoidal components typically used in the standard sinusoidal model, we propose to use only one dynamic sinusoidal component per critical band. For each band, the sinusoid with the maximum spectral amplitude is selected and associated with the centre frequency of that critical band. The model is expanded at low frequencies by incorporating sinusoids at the boundaries of the corresponding bands while at the higher frequencies a modulated noise component is used. A listening test is conducted to compare speech reconstructed with PDM and state-of-the-art models of speech, where all models are constrained to use an equal number of parameters. The results show that PDM is clearly preferred in terms of quality over the other systems. Index Terms: Sinusoidal Model, Critical band, Vocoder

    Selective CO₂ capture in metal-organic frameworks with azine-functionalized pores generated by mechanosynthesis

    Two new three-dimensional porous Zn(II)-based metal-organic frameworks, containing azine-functionalized pores, have been readily and quickly isolated via mechanosynthesis, by using a nonlinear dicarboxylate and linear N-donor ligands. The use of nonfunctionalized and methyl-functionalized N-donor ligands has led to the formation of frameworks with different topologies and metal-ligand connectivities and therefore different pore sizes and accessible volumes. Despite this, both metal-organic frameworks (MOFs) possess comparable BET surface areas and CO₂ uptakes at 273 and 298 K at 1 bar. The network with narrow and interconnected pores in three dimensions shows greater affinity for CO₂ compared to the network with one-dimensional and relatively large pores, attributable to the more effective interactions with the azine groups.